Acoustic Modeling of Speaking Styles and Emotional Expressions in HMM-Based Speech Synthesis

Authors

  • Junichi Yamagishi
  • Koji Onishi
  • Takashi Masuko
  • Takao Kobayashi
Abstract

This paper describes the modeling of various emotional expressions and speaking styles in synthetic speech using HMM-based speech synthesis. We present two methods for modeling speaking styles and emotional expressions. In the first method, called style-dependent modeling, each speaking style and emotional expression is modeled individually. In the second, called style-mixed modeling, each speaking style and emotional expression is treated as one of the contexts, alongside phonetic, prosodic, and linguistic features, and all speaking styles and emotional expressions are modeled simultaneously using a single acoustic model. We chose four styles of read speech (neutral, rough, joyful, and sad) and compared the two modeling methods using these styles. The results of subjective evaluation tests show that both modeling methods have almost the same accuracy, and that it is possible to synthesize speech with a speaking style and emotional expression similar to those of the target speech. In a style classification test on synthesized speech, more than 80% of the speech samples generated using either model were judged to be similar to the target styles. We also show that the style-mixed modeling method yields fewer output and duration distributions than the style-dependent modeling method.

Key words: HMM-based speech synthesis, expressive speech synthesis, speaking style, emotional expression, acoustic modeling, decision tree
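As a rough illustration of the two strategies described in the abstract, the toy Python sketch below clusters scalar observations with a greedy decision tree. It is not the authors' implementation: the data values, the variance-reduction splitting criterion, the `min_gain` threshold, and all function names are assumptions made for this example, and real systems cluster HMM state distributions rather than single numbers. The sketch only shows why adding style as one more context question can produce fewer distributions: contexts in which the styles do not differ are never split by style.

```python
from statistics import fmean

# Toy observations: (phone, style, value). All values are invented for
# illustration; "value" stands in for an acoustic feature such as log F0.
DATA = [
    ("a", "neutral", 5.0), ("a", "neutral", 5.1),
    ("a", "joyful", 5.6), ("a", "joyful", 5.7),
    ("a", "sad", 5.0), ("a", "sad", 5.1),
    ("k", "neutral", 0.0), ("k", "neutral", 0.1),
    ("k", "joyful", 0.0), ("k", "joyful", 0.1),
    ("k", "sad", 0.0), ("k", "sad", 0.1),
]


def sse(samples):
    """Sum of squared deviations of the values from their mean."""
    values = [s[2] for s in samples]
    mean = fmean(values)
    return sum((v - mean) ** 2 for v in values)


def questions(samples):
    """Yes/no context questions: 'phone == x?' (field 0), 'style == y?' (field 1)."""
    phones = sorted({s[0] for s in samples})
    styles = sorted({s[1] for s in samples})
    return [(0, p) for p in phones] + [(1, st) for st in styles]


def grow(samples, min_gain):
    """Greedy binary tree; returns the leaves, one list of samples per leaf."""
    base = sse(samples)
    best = None
    for field, value in questions(samples):
        yes = [s for s in samples if s[field] == value]
        no = [s for s in samples if s[field] != value]
        if not yes or not no:
            continue  # question does not separate anything here
        gain = base - sse(yes) - sse(no)
        if best is None or gain > best[0]:
            best = (gain, yes, no)
    if best is None or best[0] < min_gain:
        return [samples]  # stop: this node becomes one shared distribution
    return grow(best[1], min_gain) + grow(best[2], min_gain)


def style_dependent(samples, min_gain):
    """One tree per style: each style is clustered on its own data only."""
    leaves = []
    for style in sorted({s[1] for s in samples}):
        leaves += grow([s for s in samples if s[1] == style], min_gain)
    return leaves


def style_mixed(samples, min_gain):
    """One tree over all data, with style available as a context question."""
    return grow(samples, min_gain)


n_dep = len(style_dependent(DATA, min_gain=0.05))
n_mix = len(style_mixed(DATA, min_gain=0.05))
print("style-dependent leaves:", n_dep)
print("style-mixed leaves:", n_mix)
```

On this invented data the per-style trees keep separate leaves for every phone in every style, while the single mixed tree splits on style only for the phone whose values actually differ between styles. The counts say nothing about the figures reported in the paper.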

Similar resources

Modeling of various speaking styles and emotions for HMM-based speech synthesis

This paper presents an approach to realizing various emotional expressions and speaking styles in synthetic speech using HMM-based speech synthesis. We show two methods for modeling speaking styles and emotions. In the first method, called “style dependent modeling,” each speaking style and emotion is individually modeled. On the other hand, in the second method, called “style mixed modeling,” ...

HMM-Based Speech Synthesis with Various Speaking Styles Using Model Interpolation

This paper presents an approach to realizing various speaking styles and emotional expressions using a model interpolation technique in HMM-based speech synthesis. In the approach, we synthesize speech with an intermediate speaking style between representative speaking styles from a model obtained by interpolating representative style models. We chose three styles, “reading,” “joyful,” and “sad...

Recent Development of HMM-Based Expressive Speech Synthesis and Its Applications

This paper describes the recent development of HMM-based expressive speech synthesis. Although expressive speech includes a wide variety of expressions such as emotions, speaking styles, intention, attitude, emphasis, focus, and so on, we mainly refer to the speech synthesis techniques for emotions and speaking styles, which would be the most basic expressions in human speech communicatio...

Style estimation of speech based on multiple regression hidden semi-Markov model

This paper presents a technique for estimating the degree or intensity of emotional expressions and speaking styles that appear in speech. The key idea is based on a style control technique for speech synthesis using a multiple regression hidden semi-Markov model (MRHSMM), and the proposed technique can be viewed as the inverse process of the style control. We derive an algorithm for estimating pred...

Discrete/Continuous Modelling of Speaking Style in HMM-Based Speech Synthesis: Design and Evaluation

This paper assesses the ability of an HMM-based speech synthesis system to model the speech characteristics of various speaking styles. A discrete/continuous HMM is presented to model the symbolic and acoustic speech characteristics of a speaking style. The proposed model is used to model the average characteristics of a speaking style that is shared among various speakers, depending on specifi...

Journal:
  • IEICE Transactions

Volume 88-D (issue not listed)

Pages: not listed

Publication year: 2005